Locally baseline detection for online Arabic script based languages character recognition
Abstract
Baseline detection is one of the most important steps in character recognition and has a direct influence on the recognition result. Due to the complexity of Urdu script based languages, handwritten character recognition is considerably more difficult than for other languages. Baseline detection is a main issue and a basic step in most preprocessing operations, namely normalization, skew correction and secondary stroke segmentation, as well as in feature extraction. This paper presents a novel method of baseline detection for cursive handwritten Urdu script. The proposed approach is divided into three steps: diacritical mark segmentation, primary baseline estimation and local baseline estimation. The local baseline is estimated using features extracted from the ending shape of the words. Due to the structural difference between the Nasta'liq and Naskh styles, different rules are formed for baseline estimation.
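The first two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method, since the abstract does not give their rules. It assumes online pen data as lists of (x, y) points, treats strokes with a small bounding box as diacritical marks, and takes the densest horizontal band of the remaining points as the primary baseline, which is a common projection-histogram heuristic. The function names and the parameters `size_ratio` and `bin_height` are hypothetical. The local, ending-shape-based step is not sketched because its rules are not stated here.

```python
from collections import Counter


def bounding_box(stroke):
    """Return (x_min, y_min, x_max, y_max) for a stroke of (x, y) points."""
    xs = [x for x, _ in stroke]
    ys = [y for _, y in stroke]
    return min(xs), min(ys), max(xs), max(ys)


def split_diacritics(strokes, size_ratio=0.3):
    """Split strokes into (primary, secondary) by bounding-box diagonal.

    Assumption: a stroke whose diagonal is below size_ratio times the
    largest diagonal is a diacritical mark (dot, hamza, and so on).
    """
    def diagonal(stroke):
        x0, y0, x1, y1 = bounding_box(stroke)
        return ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5

    largest = max(diagonal(s) for s in strokes)
    primary = [s for s in strokes if diagonal(s) >= size_ratio * largest]
    secondary = [s for s in strokes if diagonal(s) < size_ratio * largest]
    return primary, secondary


def primary_baseline(strokes, bin_height=2.0):
    """Return the centre y of the horizontal band holding the most pen points."""
    counts = Counter(int(y // bin_height) for s in strokes for _, y in s)
    # Ties are broken in favour of the band with the smaller y index.
    densest_bin, _ = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))
    return (densest_bin + 0.5) * bin_height
```

Calling `split_diacritics` first and passing only the primary strokes to `primary_baseline` keeps dots and other marks from shifting the histogram peak, which is the usual reason for segmenting diacritics before estimating a baseline.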
Similar resources
Lexicon Reduction for Urdu/Arabic Script Based Character Recognition: A Multilingual OCR
Arabic script character recognition is a challenging task due to the complexity of the script and the huge number of ligatures. We present a method for the development of multilingual Arabic script OCR (Optical Character Recognition) and lexicon reduction for Arabic script and its derivative languages. The objective of the proposed method is to overcome the large dataset Urdu and similar scripts by using...
HMM Based Approach for Online Arabic Script Based Languages Character Recognition
Character recognition for Arabic script based languages remains a challenging task due to the cursive nature of the script. The task becomes more complex and demanding for handwritten Arabic script. We have used a two-layer HMM for recognition. We extracted directional and structural features for each handwritten stroke and fused these features to form a more discriminant feature matrix. The fused feature matrix...
Multi-font Numerals Recognition for Urdu Script based Languages
Handwritten character recognition for Urdu script based languages is one of the most difficult tasks due to the complexities of the script. Urdu script based languages have not received much attention, even though the script is used by more than one sixth of the population. The complexities of the script make the recognition process more complicated. The problem in handwritten numeral recognition is the shape si...
The Optical Character Recognition for Cursive Script Using HMM: A Review
Automatic Character Recognition has a wide variety of applications, such as automatic postal mail sorting, number plate recognition, automatic form reading and text entry on PDAs. Automatic Character Recognition of cursive scripts is a complex process that faces issues not found in other scripts. Many solutions have been proposed in the literature to solve complexities of cursive scripts...
The State of the Art Recognize in Arabic Script through Combination of Online and Offline
Handwriting recognition refers to the identification of written characters. Handwriting recognition has become an active research area in recent years because it eases access to computing. This paper primarily discusses on-line and off-line handwriting recognition methods for Arabic words, which are widely used by people across the Middle East and North Africa. Arabic word online ...